Tracing Data Lineage Using Schema Transformation Pathways
Authors
Abstract
With the increasing amount and diversity of information available on the Internet, there has been a huge growth in information systems that need to integrate data from distributed, heterogeneous data sources. Tracing the lineage of the integrated data is one of the current problems being addressed in data warehouse research. In this chapter, we propose a new approach for tracing data lineage which is based on schema transformation pathways. We show how the individual transformation steps in a transformation pathway can be used to trace the derivation of the integrated data in a step-wise fashion. Although developed for a graph-based common data model and a functional query language, our approach is not limited to these and would be useful in any data transformation/integration framework based on sequences of primitive schema transformations.
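The step-wise idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it uses neither their graph-based data model nor their functional query language. It assumes that a pathway is a list of hypothetical "add" steps, each defining a new construct by a per-row function over existing constructs; this covers selection, projection and union-style queries only. All names (AddStep, materialise, trace_lineage, the staff tables) are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple

Row = Tuple                     # a data item is modelled as a plain tuple
Extents = Dict[str, Set[Row]]   # construct name -> set of data items


@dataclass
class AddStep:
    """Hypothetical primitive step: add `target`, derived row by row from `sources`."""
    target: str
    sources: List[str]
    # derive(source_name, row) returns the derived item, or None if the row is filtered out
    derive: Callable[[str, Row], Optional[Row]]


def materialise(base: Extents, pathway: List[AddStep]) -> Extents:
    """Apply the pathway forwards, computing the extent of every added construct."""
    extents = {name: set(rows) for name, rows in base.items()}
    for step in pathway:
        derived = set()
        for src in step.sources:
            for row in extents[src]:
                out = step.derive(src, row)
                if out is not None:
                    derived.add(out)
        extents[step.target] = derived
    return extents


def trace_lineage(item: Row, construct: str, extents: Extents,
                  pathway: List[AddStep]) -> Set[Tuple[str, Row]]:
    """Walk the pathway backwards, replacing each tracked item by the items it came from."""
    frontier = {(construct, item)}
    for step in reversed(pathway):
        next_frontier = set()
        for name, tracked in frontier:
            if name != step.target:
                next_frontier.add((name, tracked))  # this step did not create the item
                continue
            for src in step.sources:
                for row in extents[src]:
                    if step.derive(src, row) == tracked:
                        next_frontier.add((src, row))
        frontier = next_frontier
    return frontier


# Illustrative data: two source tables, a union step, then a projection step.
base = {
    "staff_uk": {("ann", "london", 30), ("bob", "leeds", 45)},
    "staff_fr": {("cec", "paris", 30)},
}
pathway = [
    AddStep("staff", ["staff_uk", "staff_fr"], lambda src, row: row),
    AddStep("ages", ["staff"], lambda src, row: (row[2],)),
]
extents = materialise(base, pathway)
lineage = trace_lineage((30,), "ages", extents, pathway)
```

Here the item (30,) in the integrated construct "ages" is traced back first to two rows of "staff" and then to one row in each source table. Steps that join or aggregate would need lineage rules of their own, which this sketch does not attempt.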
Similar resources
Using Schema Transformation Pathways for Data Lineage Tracing
With the increasing amount and diversity of information available on the Internet, there has been a huge growth in information systems that need to integrate data from distributed, heterogeneous data sources. Tracing the lineage of the integrated data is one of the problems being addressed in data warehousing research. This paper presents a data lineage tracing approach based on schema transfor...
Incremental view maintenance and data lineage tracing in heterogeneous database environments
With the increasing amount and diversity of information available on the Internet, there has been a huge growth in information systems that need to integrate data from distributed, heterogeneous data sources. AutoMed (Automatic Generation of Mediator Tools for Heterogeneous Database Integration) is a database transformation and integration system, which is designed to support virtual and materi...
Investigating a heterogeneous data integration approach for data warehousing
Data warehouses integrate data from remote, heterogeneous, autonomous data sources into a materialised central database. The heterogeneity of these data sources has two aspects, data expressed in different data models, called model heterogeneity, and data expressed within different schemas of the same data model, called schema heterogeneity. AutoMed is an approach to heterogeneous data transfor...
Lineage Tracing in a Data Warehousing System
A data warehousing system collects data from multiple distributed sources and stores the integrated information as materialized views in a local data warehouse. Users then perform data analysis and mining on the warehouse views. Figure 1 shows the basic architecture of a data warehousing system. In many cases, the warehouse view contents alone are not sufficient for in-depth analysis. It is often...
Publication date: 2003